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Abstract 

We study the problem of fitting parametrized curves to noisy data. Under 
certain assumptions (known as Cartesian and radial functional models), we derive 
asymptotic expressions for the bias and the covariance matrix of the parameter 
estimates. We also extend Kanatani's version of the Cramer-Rao lower bound, 
which he proved for unbiased estimates only, to more general estimates that include 
many popular algorithms (most notably, the orthogonal least squares and algebraic 
fits). We then show that the gradient- weighted algebraic fit is statistically efficient 
and describe all other statistically efficient algebraic fits. 

Keywords: least squares fit, curve fitting, circle fitting, algebraic fit, Rao-Cramer 

bound, efficiency, functional model. 

1 Introduction 

In many applications one fits a parametrized curve described by an implicit equation 
P(x,y; 0) = to experimental data (xi,yi), i = 1, . . . ,n. Here G denotes the vector of 
unknown parameters to be estimated. Typically, P is a polynomial in x and y, and its 
coefficients are unknown parameters (or functions of unknown parameters). For example, 
a number of recent publications [21 EH El E3 El] are devoted to the problem of fitting 
quadrics Ax 2 + Bxy + Cy 2 + Dx + Ey + F = 0, in which case 6 = (A, B, C, D, E, F) is the 
parameter vector. The problem of fitting circles, given by equation (x— a) 2 +(y— b) 2 — R 2 = 
with three parameters a, b, R, also attracted attention [HI HU Ho] ITS]. 

We consider here the problem of fitting general curves given by implicit equations 
P(x,y;Q) = with = (# 1; . . . , #&) being the parameter vector. Our goal is to in- 
vestigate statistical properties of various fitting algorithms. We are interested in their 
biasedness, covariance matrices, and the Cramer-Rao lower bound. 
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First, we specify our model. We denote by the true value of 6. Let (xi,yi), 
i = 1, ...,n, be some points lying on the true curve P(x,y;Q) = 0. Experimentally 
observed data points (xi,yi), i — 1, . . . ,n, are perceived as random perturbations of the 
true points (xi,y~i). We use notation Xj = (xi,yi) T and Xj = (xi,y~i) T , for brevity. The 
random vectors = x« — Xj are assumed to be independent and have zero mean. Two 
specific assumptions on their probability distribution can be made, see jl]: 

Cartesian model: Each e« is a two-dimensional normal vector with covariance matrix 
afl, where / is the identity matrix. 

Radial model: = where & is a normal random variable A/"(0, erf), and n; is a 
unit normal vector to the curve P(x, y; 0) = at the point Xj. 

Our analysis covers both models, Cartesian and radial. For simplicity, we assume that 
of = o 2 for all i, but note that our results can be easily generalized to arbitrary of > 0. 

Concerning the true points Xj, i = l,...,n, two assumptions are possible. Many 
researchers EH El consider them as fixed, but unknown, points on the true curve. 
In this case their coordinates (x iy yi) can be treated as additional parameters of the 
model (nuisance parameters). Chan [(J, and others jSlll] call this assumption & functional 
model. Alternatively, one can assume that the true points Xj are sampled from the curve 
P(x,y;Q) = according to some probability distribution on it. This assumption is 
referred to as a structural model [Hill]. We only consider the functional model here. 

It is easy to verify that maximum likelihood estimation of the parameter for the 
functional model is given by the orthogonal least squares fit (OLSF), which is based on 
minimization of the function 

^(o) = XXo)] 2 (i.i) 

i=l 

where tij(O) denotes the distance from the point Xj to the curve P(x,y;Q) = 0. The 
OLSF is the method of choice in practice, especially when one fits simple curves such 
as lines and circles. However, for more general curves the OLSF becomes intractable, 
because the precise distance di is hard to compute. For example, when P is a generic 
quadric (ellipse or hyperbola), the computation of di is equivalent to solving a polynomial 
equation of degree four, and its direct solution is known to be numerically unstable, see 
pH E] for more detail. Then one resorts to various approximations. It is often convenient 
to minimize 

n 

^ 2 (©)=£[P(a;,,,y,-e)] 2 (1.2) 
i=i 

instead of (jl.lj) . This method is referred to as a (simple) algebraic fit (AF), in this case 
one calls \P{x^ y^, 0)| the algebraic distance (2111011111 from the point (xj,?/j) to the 
curve. The AF is computationally cheaper than the OLSF, but its accuracy is often 
unacceptable, see below. 
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The simple AF ()1.2|) can be generalized to a weighted algebraic fit, which is based on 
minimization of 

^ 3 (©) = E^P(« ; e)] 2 (i.3) 

i=l 

where Wi = w(xi,yi]Q) are some weights, which may balance p. 2)1 and improve its 
performance. One way to define weights Wi results from a linear approximation to df 



d; 



|P(x^;Q)| 

IVxPOr^e)!! 



where V X P = (dP/dx,dP/dy) is the gradient vector, see (201- Then one minimizes the 
function 

(L4) 

This method is called the gradient weighted algebraic fit (GRAF). It is a particular case 
of (JOD with Wi = l/||V x P(a; i ,?/ i ;e)|| 2 . 

The GRAF is known since at least 1974 [21] and recently became standard for polyno- 
mial curve fitting [201 EH QH] ■ The computational cost of GRAF depends on the function 
P(x,y;Q), but, generally, the GRAF is much faster than the OLSF. It is also known 
from practice that the accuracy of GRAF is almost as good as that of the OLSF, and 
our analysis below confirms this fact. The GRAF is often claimed to be a statistically 
optimal weighted algebraic fit, and we will prove this fact as well. 

Not much has been published on statistical properties of the OLSF and algebraic 
fits, apart from the simplest case of fitting lines and hyperplanes [12]. Chan [H], Berman 
and Culpin J3] investigated circle fitting by the OLSF and the simple algebraic fit (jl.2j) 
assuming the structural model. Kanatani [TB*] ITi] used the Cartesian functional model 
and considered a general curve fitting problem. He established an analogue of the Rao- 
Cramer lower bound for unbiased estimates of O, which we call here Kanatani- Cramer- 
Rao (KCR) lower bound. He also showed that the covariance matrices of the OLSF and 
the GRAF attain, to the leading order in a, his lower bound. We note, however, that in 
most cases the OLSF and algebraic fits are biased 0JE], hence the KCR lower bound, as 
it is derived in (TSl EH] , does not immediately apply to these methods. 

In this paper we extend the KCR lower bound to biased estimates, which include the 
OLSF and all weighted algebraic fits. We prove the KCR bound for estimates satisfying 
the following mild assumption: 

Precision assumption. For precise observations (when x 4 = 5q for all 1 < i < n), the 
estimate is precise, i.e. 

0( Xl ,...,x n ) = 6 (1.5) 

It is easy to check that the OLSF and algebraic fits (jl.3j) satisfy this assumption. We 
will also show that all unbiased estimates of satisfy (|1.5j) . 

We then prove that the GRAF is, indeed, a statistically efficient fit, in the sense that 
its covariance matrix attains, to the leading order in a, the KCR lower bound. On the 
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other hand, rather surprisingly, we find that GRAF is not the only statistically efficient 
algebraic fit, and we describe all statistically efficient algebraic fits. Finally, we show that 
Kanatani's theory and our extension to it remain valid for the radial functional model. 
Our conclusions are illustrated by numerical experiments on circle fitting algorithms. 

2 Kanatani- Cramer- Rao lower bound 

Recall that we have adopted the functional model, in which the true points Xj, 1 < i < 
n, are fixed. This automatically makes the sample size n fixed, hence, many classical 
concepts of statistics, such as consistency and asymptotic efficiency (which require taking 
the limit n — ► oo) lose their meaning. It is customary, in the studies of the functional 
model of the curve fitting problem, to take the limit a — > instead of n — > oo, cf. 
|13) I14j. This is, by the way, not unreasonable from the practical point of view: in many 
experiments, n is rather small and cannot be (easily) increased, so the limit n — > oo is 
of little interest. On the other hand, when the accuracy of experimental observations is 
high (thus, a is small), the limit a — > is quite appropriate. 

Now, let 0(xi, . . . , x n ) be an arbitrary estimate of O satisfying the precision assump- 
tion (jl.5j) . In our analysis we will always assume that all the underlying functions are 
regular (continuous, have finite derivatives, etc.), which is a standard assumption [131 Ej. 

The mean value of the estimate O is 

£(©) = [■■■ /e( Xl ,...,x n ) f[f(x i )cbt 1 .--dx n (2.1) 

i=l 

where /(xj) is the probability density function for the random point Xj, as specified by 
a particular model (Cartesian or radial). 

We now expand the estimate 0(x 1; . . . ,x n ) into a Taylor series about the true point 
(xi, . . . , x n ) remembering ([1.5)1 : 

n 

6( Xl , . . . ,x n ) = + £ 6, x (xi - Xi) + 0(a 2 ) (2.2) 

i=l 

where 

©i = V Xi O(xi, . . . ,x n ), i = l,...,n (2.3) 

and V Xi stands for the gradient with respect to the variables Xi,yi. In other words, Oj is 
a k x 2 matrix of partial derivatives of the k components of the function © with respect 
to the two variables x« and and this derivative is taken at the point (xi, . . . , x n ), 
Substituting the expansion ()2.2|) into 1)2.1)) gives 

E{Q) = © + 0(a 2 ) (2.4) 

since J5(xj — Xj) = 0. Hence, the bias of the estimate © is of order a 2 . 
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It easily follows from the expansion (|2.2|) that the covariance matrix of the estimate 
is given by 

n 

i=i 

(it is not hard to see that the cubical terms 0(a 3 ) vanish because the normal random 
variables with zero mean also have zero third moment, see also [13J). Now, for the 
Cartesian model 

E[(x i -%)(x i -x i ) T ]=o*I 

and for the radial model 

E[(xi - Xj)(xj - Xi) T ] = crVnf 

where rij is a unit normal vector to the curve P(x, y;Q) = at the point Xj. Then we 
obtain 

Ce = a 2 X:e j A J 6f + 0(a A ) (2.5) 
i=i 

where A* = I for the Cartesian model and Aj = mnf for the radial model. 

Lemma. We have OjmnfOf = OjOf for each i = 1, . . . ,n. Hence, for both models, 
Cartesian and radial, the matrix Cq is given by the same expression: 

n 

C 6 = a a X;e i ef + 0(a 4 ) (2.6) 
i=i 

This lemma is proved in Appendix. 

Our next goal is now to find a lower bound for the matrix 

n 

p i: =5> t ef (2.7) 

i=l 

Following > we consider perturbations of the parameter vector + 59 and the 

true points 5q + 5x,j satisfying two constraints. First, since the true points must belong 
to the true curve, P(x$; 9) = 0, we obtain, by the chain rule, 

(V x P(x, ; 9), 5x 4 ) + (V e P(x,; 9), 5Q) = (2.8) 

where (•, •) stands for the scalar product of vectors. Second, since the identity (jl.5|) holds 
for all 9, we get 

X)e < <5x i = (5e (2.9) 

8=1 

by using the notation (|2.3J) . 
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Now we need to find a lower bound for the matrix (|2.7|) subject to the constraints 
()2.8j) and (|2.9|) . That bound follows from a general theorem in linear algebra: 



Theorem (Linear Algebra). Let n > k > 1 and m > 1. Suppose n nonzero vectors 
Ui G R m and n nonzero vectors Vi G R fc are given, 1 < i < n. Consider k x m matrices 



Y - 

rp 

ufUi 



for 1 < i < n, and k x k matrix 



T 

i=l i=l U i Ui 



Assume that the vectors Vi,...,v n span R fc (hence B is nonsingular) . We say that a set 
of n matrices Ai, . . . , A n ( each of size k x m) is proper if 



n 



Y, A iWi=r (2.10) 
i=i 

for any vectors W{ G R m and r G R fc such that 

ujwi + vfr = (2.11) 

for all 1 < i < n. Then for any proper set of matrices Ai, . . . ,A n the k x k matrix 
D = Yh=i AiAj is bounded from below by B~ x in the sense that D — B^ 1 is a positive 
semidefinite matrix. The equality D = B^ 1 holds if and only if Ai = —B~ x Xi for all 
i = 1, . . . , n. 

This theorem is, probably, known, but we provide a full proof in Appendix, for the 
sake of completeness. 

As a direct consequence of the above theorem we obtain the lower bound for our 
matrix T>\\ 

Theorem (Kanatani-Cramer-Rao lower bound). We have T>\ > T> min , in the sense 
that T>i — X> m i n is a positive semidefinite matrix, where 

^ (V e P(x t ;e))(V e P(x t ;9)f 

min ~k ||v x p(x i; e)||3 {Z - U) 

In view of f!2.6|) and ([2.7)1 . the above theorem says that the lower bound for the 
covariance matrix C e is, to the leading order, 

Cq > C m ; n = (J ©min (2.13) 
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The standard deviations of the components of the estimate 6 are of order ctq = 0(a). 
Therefore, the bias of 0, which is at most of order a 2 by ()2.4jl . is infinitesimally small, as 
cr — > 0, compared to the standard deviations. This means that the estimates satisfying 
fjl.5|) are practically unbiased. 

The bound ()2.13)) was first derived by Kanatani [TB*) IT4") for the Cartesian functional 
model and strictly unbiased estimates of 0, i.e. satisfying E(Q) = 9. One can easily 
derive (jl.5j) from E(Q) = 9 by taking the limit a — > 0, hence our results generalize those 
of Kanatani. 



3 Statistical efficiency of algebraic fits 



Here we derive an explicit formula for the covariance matrix of the weighted algebraic fit 
(11.3)1 and describe the weights Wi for which the fit is statistically efficient. For brevity, 
we write Pi = P(xi, yi, 9). We assume that the weight function w(x, y, ; 9) is regular, in 
particular has bounded derivatives with respect to 9, the next section will demonstrate 
the importance of this condition. The solution of the minimization problem (jl.3)l satisfies 



J2 Pi Vewi + 2j2wiPi V @ Pi = 



(3.1) 



Observe that Pj = 0(a), so that the first sum in ()3.1)1 is 0(a 2 ) and the second sum is 
C(cr). Hence, to the leading order, the solution of ()3.1j) can be found by discarding the 
first sum and solving the reduced equation 



Y,WiPi\7eP = 



(3.2) 



More precisely, if 9i and G>2 are solutions of ()3.1j) and ()3.2j) . respectively, then 9i — 9 = 
0(a), G>2 — 9 = 0(a), and ||@i — O2II = 0(a 2 ). Furthermore, the covariance matrices of 
9i and 9 2 coincide, to the leading order, i.e. C^C^ — > I as a — > 0. Therefore, in what 
follows, we only deal with the solution of equation ([3.2)1 . 

To find the covariance matrix of 9 satisfying 1)3.2)1 we put 9 = 9+59 and Xj = Xj+^x, 
and obtain, working to the leading order, 



^(VePXVePf (59) = -^(V^f (5^) (V e P) + 0(a 2 



hence 



50 = - [j2wi(V e P i )(V e P i ) T } 1 [E^(VxP) T (Sxi) (V e Pi)] + 0(a 2 ) 
The covariance matrix is then 

Ce = E[(5Q)(5Qf 

= a 2 [X)«; i (VeP<)(VePi) T ]" 1 [E^llVxPll^VeP)^,) 
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Denote by T> 2 the principal factor here, i.e. 

^2 = [E^VePiXVePifp 1 [£w 2 ||V x P|| 2 (V P)(VeP) T ] [5> t (V P)(V e P) 
The following theorem establishes a lower bound for T> 2 : 

Theorem. We have T> 2 > T^min, i> n the sense that T> 2 — T> min is a positive semide finite 
matrix, where T> min is given by \2.1ty) . The equality T> 2 = T> min holds if and only if Wi = 
const/ 1| V x Pi || 2 for alii — 1, ... ,n. In other words, an algebraic fit M.ty is statistically 
efficient if and only if the weight function w(x, y; 6) satisfies 

for all triples x,y, G such that P(x,y; G) = 0. Here c(Q) may be an arbitrary function 

fe. 

The bound T> 2 > l^min here is a particular case of the previous theorem. It also can 
be obtained directly from the linear algebra theorem if one sets Ui = V x Pj, V{ = V Pi, 
and 

n 

Ai = - Wl J2 w i(V e P J )(V & P j ) T 

3=1 



-1 

(V P) (V X P) T 



for 1 < i < n. 

The expression f!3.3j) characterizing the efficiency, follows from the last claim in the 
linear algebra theorem. 

4 Circle fit 

Here we illustrate our conclusions by the relatively simple problem of fitting circles. The 
canonical equation of a circle is 

(x - a) 2 + (y - b) 2 - R 2 = (4.1) 

and we need to estimate three parameters a, b, R. The simple algebraic fit (jl.2j) takes 
form 

n 

F 2 (a,b,R)=J2l&-a) 2 + (yi-b) 2 -R 2 } 2 -> min (4.2) 

i=l 

and the weighted algebraic fit (jl.3j) takes form 

n 

^ 3 (a,b,R) = Y / ^i[(xi-a) 2 + (y i -b) 2 -R 2 ] 2 -> min (4.3) 



i=i 
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In particular, the GRAF becomes 



^■ fl >-g (*,-„)» + (w -6)» ' mm (4 ' 4) 

(where the irrelevant constant factor of 4 in the denominator is dropped). 
In terms of (|2.12|) . we have 

V P(x i; 0) = -2{xi -a,Vi- b, R) T 

and V x P(xi] @) = 2(xi — a,y~i — b) T , hence 

||V x P(x,; 0) || 2 = 4[(z< - a) 2 + (ft - b) 2 } = AR 2 

Therefore, 



where we denote, for brevity, 



U; 



Ev 2 Evi | (1.0) 

- a ft - 6 

= , Vj = 

R ' P 



The above expression for XVm was derived earlier in E] • 

Now, our Theorem in Section |3] shows that the weighted algebraic fit ()4.3|) is statisti- 
cally efficient if and only if the weight function satisfies w(x, y; a, b, R) = c(a, b, R) / {AR 2 ). 
Since c(a, b, R) may be an arbitrary function, then the denominator AR 2 here is irrelevant. 
Hence, statistically efficiency is achieved whenever w(x,y; a,b, R) is simply independent 
of x and y for all (x, y) lying on the circle. In particular, the GRAF f)4.4j) is statistically 
efficient because w(x, y\ a, b, R) = [(x — a) 2 + (y — b) 2 ]^ 1 = Rr 2 . The simple AF (|4.2|) is 
also statistically efficient since w(x,y;a,b, R) = 1. 

We note that the GRAF (J4.4j) is a highly nonlinear problem, and in its exact form 
(I4.4j) is not used in practice. Instead, there are two modifications of GRAF popular 
among experimenters. One is due to Chernov and Ososkov jH] and Pratt |17j : 



n 



F' 4 (a,b,R)=R- 2 Y,[(xi-a) 2 + (y l -b) 2 -R 2 } 2 -> min (4.6) 



i=l 



(it is based on the approximation (x« — a) 2 + (yi — b) 2 ~ R 2 ), and the other due to Agin 
(Tj and Taubin 



J?(a, b, R) = _ + _ - a) 2 + fa - &) 2 - R 2 ] 2 - min (4.7) 
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(here one simply averages the denominator of (|4.4|) over 1 < % < n). We refer the reader 
to [H] for a detailed analysis of these and other circle fitting algorithms, including their 
numerical implementations. 

We have tested experimentally the efficiency of four circle fitting algorithms: the 
OLSF (JHEJ), the simple AF IfP) . the Pratt method IfOty . and the Taubin method (jOJ). 
We have generated n = 20 points equally spaced on a circle, added an isotropic Gaussian 
noise with variance a 2 (according to the Cartesian model), and estimated the efficiency 
of the estimate of the center by 

E = (4.8) 
{(a - af + (6-6) 2 ) 

Here (a, 6) is the true center, (a, 6) is its estimate, (• • •) denotes averaging over many 
random samples, and T>n, T>22 are the first two diagonal entries of the matrix (|4.5j) . 
Table 1 shows the efficiency of the above mentioned four algorithms for various values 
of a/R. We see that they all perform very well, and indeed are efficient as a — > 0. One 
might notice that the OLSF slightly outperforms the other methods, and the AF is the 
second best. 



a/R 


OLSF 


AF 


Pratt 


Taubin 


< 0.01 


~ 1 


~ 1 


~ 1 


~ 1 


0.01 


0.999 


0.999 


0.999 


0.999 


0.02 


0.999 


0.998 


0.997 


0.997 


0.03 


0.998 


0.996 


0.995 


0.995 


0.05 


0.996 


0.992 


0.987 


0.987 


0.10 


0.985 


0.970 


0.953 


0.953 


0.20 


0.935 


0.900 


0.837 


0.835 


0.30 


0.825 


0.824 


0.701 


0.692 



Table 1. Efficiency of circle fitting algorithms. Data are sampled along a full circle. 

Table 2 shows the efficiency of the same algorithms as the data points are sampled 
along half a circle, rather than a full circle. Again, the efficiency as a —>■ is clear, but we 
also make another observation. The AF now consistently falls behind the other methods 
for all a/R < 0.2, but for a/R = 0.3 the others suddenly break down, while the AF keeps 
afloat. 
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a/R 


OLSF 


AF 


Pratt 


Taubin 


< 0.01 


~ 1 


~ 1 


~ 1 


~ 1 


0.01 


0.999 


0.996 


0.999 


0.999 


0.02 


0.997 


0.983 


0.997 


0.997 


0.03 


0.994 


0.961 


0.992 


0.992 


0.05 


0.984 


0.902 


0.978 


0.978 


0.10 


0.935 


0.720 


0.916 


0.916 


0.20 


0.720 


0.493 


0.703 


0.691 


0.30 


0.122 


0.437 


0.186 


0.141 



Table 2. Efficiency of circle fitting algorithms with data sampled along half a circle. 

The reason of the above turnaround is that at large noise the data points may occa- 
sionally line up along a circular arc of a very large radius. Then the OLSF, Pratt and 
Taubin dutifully return a large circle whose center lies far away, and such fits blow up 
the denominator of (|4.8|) . a typical effect of large outliers. On the contrary, the AF is 
notoriously known for its systematic bias toward smaller circles (HI El EZj , hence while 
it is less accurate than other fits for typical random samples, its bias safeguards it from 
large outliers. 

This behavior is even more pronounced when the data are sampled along quarter 1 of 
a circle (Table 3). We see that the AF is now far worse than the other fits for a/R < 0.1 
but the others characteristically break down at some point (a/R = 0.1). 



a/R 


OLSF 


AF 


Pratt 


Taubin 


0.01 


0.997 


0.911 


0.997 


0.997 


0.02 


0.977 


0.722 


0.978 


0.978 


0.03 


0.944 


0.555 


0.946 


0.946 


0.05 


0.837 


0.365 


0.843 


0.842 


0.10 


0.155 


0.275 


0.163 


0.158 



Table 3. Data are sampled along a quarter of a circle. 

It is interesting to test smaller circular arcs, too. Figure 1 shows a color-coded diagram 
of the efficiency of the OLSF and the AF for arcs from 0° to 50° and variable a (we set 
a = ch, where h is the height of the circular arc, see Fig. 2, and c varies from to 0.5). 
The efficiency of the Pratt and Taubin is virtually identical to that of the OLSF, so it is 
not shown here. We see that the OLSF and AF are efficient as a — > (both squares in 
the diagram get white at the bottom), but the AF loses its efficiency at moderate levels 

1 All our algorithms are invariant under simple geometric transformations such as translations, rota- 
tions and similarities, hence our experimental results do not depend on the choice of the circle, its size, 
and the part of the circle the data are sampled from. 
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of noise (c > 0.1), while the OLSF remains accurate up to c = 0.3 after which it rather 
sharply breaks down. 




Figure 1: The efficiency of the simple OLSF (left) and the AF (center). The bar on the 

right explains color codes. 

The following analysis sheds more light on the behavior of the circle fitting algorithms. 
When the curvature of the arc decreases, the center coordinates a, b and the radius R 
grow to infinity and their estimates become highly unreliable. In that case the circle 
equation (J4.1j) can be converted to a more convenient algebraic form 

A(x 2 + y 2 ) + Bx + Cy + D = (4.9) 

with an additional constrain on the parameters: B 2 +C 2 — AAD = 1. This parametrization 
was used in [TTJEh an d analyzed in detail in jU]. We note that the original parameters 
can be recovered via a = —B/2A, b = —C/2A, and R = (2 |t4|) _1 . The new parametriza- 
tion ()4.9|) is safe to use for arcs with arbitrary small curvature: the parameters A, B, C, D 
remain bounded and never develop singularities, see jH] • Even as the curvature vanishes, 
we simply get A = 0, and the equation ()4.9j) represents a line Bx + Cy + D = 0. 



<3 = ch 




Figure 2: The height of an arc, h, and our formula for cr. 
In terms of the new parameters A, B,C, D, the weighted algebraic fit (jl.3|) takes form 

n 

J r 3 (A,B,C,D) = Y / w i [A(x 2 + y 2 ) + Bx + Cy + D] 2 -> min (4.10) 

i=i 
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(under the constraint B 2 +C 2 — 4AD = 1). Converting the AF (|4.2|) to the new parameters 
gives 

n 

F 2 (A,B,C,D)=J2A- 2 lA(x 2 + y 2 ) + Bx + Cy + D} 2 -> min (4.11) 
i=i 

which corresponds to the weight function w = I /A 2 . The Pratt method (j4.fi)) turns to 

n 

F i (A 1 B 1 C 1 D) = Y}A(x 2 + y 2 )+Bx + Cy + D} 2 -> min (4.12) 

i=l 

We now see why the AF is unstable and inaccurate for arcs with small curvature: its 
weight function w = 1/A 2 develops a singularity (it explodes) in the limit A — > 0. Recall 
that, in our derivation of the statistical efficiency theorem (Section 3), we assumed that 
the weight function was regular (had bounded derivatives). This assumption is clearly 
violated by the AF (|4.11jl . On the contrary, the Pratt fit ()4.12|) uses a safe choice w = 1 
and thus behaves decently on arcs with small curvature, see next. 




Figure 3: The efficiency of the simple AF (left) and the Pratt method (center). The bar 

on the right explains color codes. 

Figure 3 shows a color-coded diagram of the efficiency of the estimate of the param- 
eter 2 A by the AF (f4~TT]l versus Pratt (|4.12j) for arcs from 0° to 50° and the noise level 
a = ch, where h is the height of the circular arc and c varies from to 0.5. The efficiency 
of the OLSF and the Taubin method is visually indistinguishable from that of Pratt (the 
central square in Fig. 3), so we did not include it here. 

We see that the AF performs significantly worse than the Pratt method for all arcs 
and most of the values of c (i.e., a). The Pratt's efficiency is close 100%, its lowest point 
is 89% for 50° arcs and c = 0.5 (the top right corner of the central square barely gets 
grey). The AF's efficiency is below 10% for all c > 0.2 and almost zero for c > 0.4. Still, 

2 Note that |^4| = 1/2R, hence the estimation of A is equivalent to that of the curvature, an important 
geometric parameter of the arc. 
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the AF remains efficient as o — > (as the tiny white strip at the bottom of the left square 
proves), but its efficiency can be only counted on when a is extremely small. 

Our analysis demonstrates that the choice of the weights Wi in the weighted algebraic 
fit should be made according to our theorem in Section 3, and, in addition, one 

should avoid singularities in the domain of parameters. 



Appendix 

Here we prove the theorem of linear algebra stated in Section El For the sake of clarity, 
we divide our proof into small lemmas: 

Lemma 1. The matrix B is indeed nonsingular. 

Proof. If Bz = for some nonzero vector z G R fc , then = z T Bz = Yl^iivf z ) 2 / \\ u i\\ 2 , 
hence vfz = for all 1 < % < k, a contradiction. 

Lemma 2. // a set of n matrices A\, . . . , A n is proper, then rank(Ai) < 1. Furthermore, 
each Ai is given by Ai = ZiuJ for some vector zi G H fc , and the vectors z%,...,z n satisfy 
Yh=i z i v T = ~I where I is the k x k identity matrix. The converse is also true. 

Proof. Let vectors w\, . . . , w n and r satisfy the requirements ()2.10|) and ()2.11|) of the 
theorem. Consider the orthogonal decomposition u>; = CjWj + wf where wj- is perpendic- 
ular to Ui, i.e. ujw^ = 0. Then the constraint ()2.11|) can be rewritten as 



T 

v- r 

T 



(a.i; 



for all i = 1, . . . , n and ()2.10|) takes form 



CiA iUi + M 1 = r ( A - 2 ) 
1=1 1=1 

We conclude that Aiwf = for every vector orthogonal to Ui, hence Ai has a {k — 1)- 
dimensional kernel, so indeed its rank is zero or one. If we denote Z{ = AiUi/\\ui\\ 2 , we 
obtain Ai = Ziuf. Combining this with (|A.l|) - (jA.2|) gives 

n / n \ 

T = ~ Y.( V I r ') Z i = _ I] Z i V i r 
8=1 \j=l / 

Since this identity holds for any vector r G R fc , the expression within parentheses is — /. 
The converse is obtained by straightforward calculations. Lemma is proved. 

Corollary. Let m = Then AjmnfAj = AiAj for each i. 

This corollary implies our lemma stated in Section |21 We now continue the proof of 
the theorem. 
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Lemma 3. The sets of proper matrices make a linear variety, in the following sense. Let 
A[, . . . ,A' n and A'{, . . . , A" n be two proper sets of matrices, then the set A ± , . . . , A n defined 
by Ai = A[ + c(A" — A'A is proper for every cGR 

Proof. According to the previous lemma, A\ = z[uj and A" = z"uj for some vectors 
z'i, z" , 1 < i < n. Therefore, A^ = ZiuJ for Zi = z[ + c{z" — z r ^). Lastly, 

n n n n 

E = E <vf + c e <vj - c e z[vj = -i 

i=l i=l i=l i=l 

Lemma is proved. 

Lemma 4. // a set of n matrices A ± , . . . , A n is proper, then Yh=i AiXj = —I , where I 
is the k x k identity matrix. 

Proof. By using Lemma 2 J27=i AiXf = Yh=i z, i v J = Lemma is proved. 

Lemma 5. We have indeed D > B^ 1 . 



Proof. For each % = 1, . . . , n consider the 2k x m matrix Yi = 




Using the 



previous lemma gives 

By construction, this matrix is positive semidefmite. Hence, the following matrix is also 
positive semidefmite: 

I B- 1 \ ( D -I \ ( I \_fD-B- 1 \ 
b- 1 ) \ -I B jyB- 1 B- 1 ) ~\ B- 1 ) 

By Sylvester's theorem, the matrix D — B^ 1 is positive semidefmite. 

Lemma 6. The set of matrices A° = —B~ x Xi is proper, and for this set we have 
D = B-\ 

Proof. Straightforward calculation. 

Lemma 7. If D — B^ 1 for some proper set of matrices A ± , . . . , A n , then Ai = A° for all 
1 < % < n. 

Proof. Assume that there is a proper set of matrices A' 1) ... 1 A' n1 different from 
A°,..., A°, for which D = B' 1 . Denote 5A, t = A\ - A°. By Lemma 3, the set of 
matrices ^(7) = A° + 7(<L4j) is proper for every real 7. Consider the variable matrix 

n 

D{l) = E[A(7)][A(7)] T 
i=i 

n / n n \ n 

i=l \i=l i=l / i=l 

Note that the matrix R = £™ =1 A°(5Ai) T + Y^ =l {5Ai)(A°) T is symmetric. By Lemma 5 
we have ^(7) > B^ 1 for all 7, and by Lemma 6 we have -D(O) = B~ x . It is then easy 
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to derive that R = 0. Next, the matrix S = Y,?=i(fiAi)(5Ai) T is symmetric positive 
semidefinite. Since we assumed that D(l) = D(0) = B' 1 , it is easy to derive that S = 
as well. Therefore, 5Ai = for every i — 1, . . . , n. The theorem is proved. 
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